Diversified SVM Ensembles for Large Data Sets
Abstract
Recently, the core vector machine (CVM) has shown significant speedups on classification and regression problems with massive data sets. Its accuracy is also close to that of other state-of-the-art SVM implementations. Incorporating orthogonality constraints to diversify CVM ensembles turns out to speed up the maximum margin discriminant analysis (MMDA) algorithm. Extensive comparisons with the MMDA ensemble and with bagging on a number of large data sets show that the proposed diversified CVM ensemble can improve classification performance and is faster than the original MMDA algorithm by more than an order of magnitude.
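The role of orthogonality constraints in diversifying an ensemble can be illustrated with a minimal sketch. It assumes linear ensemble members represented by weight vectors, and it assumes the constraint means that every pair of members has zero inner product. It enforces this with Gram-Schmidt projection rather than by solving a constrained CVM or MMDA optimization, so it is not the paper's algorithm. The function names `orthogonalize`, `diversify`, and `ensemble_predict` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(candidate, previous):
    """Remove from `candidate` its component along each vector in `previous`.

    Assumes the vectors in `previous` are already mutually orthogonal.
    """
    w = list(candidate)
    for p in previous:
        scale = dot(w, p) / dot(p, p)
        w = [wi - scale * pi for wi, pi in zip(w, p)]
    return w


def diversify(directions, tol=1e-12):
    """Turn candidate weight vectors into mutually orthogonal unit vectors.

    Candidates that add no new direction (norm below `tol` after
    projection) are dropped, since they would duplicate earlier members.
    """
    kept = []
    for d in directions:
        w = orthogonalize(d, kept)
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            kept.append([wi / norm for wi in w])
    return kept


def ensemble_predict(members, x):
    """Majority vote over the sign of each linear member's score (ties go to +1)."""
    votes = sum(1 if dot(w, x) >= 0 else -1 for w in members)
    return 1 if votes >= 0 else -1


if __name__ == "__main__":
    # Hypothetical weight vectors; the third duplicates the first direction.
    raw = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    members = diversify(raw)
    print(len(members), "members kept")
    print("prediction:", ensemble_predict(members, [1.0, 1.0, -1.0]))
```

In this sketch, diversity is imposed after the fact on vectors that are already given. A method of the kind the abstract describes would instead build the constraint into the training of each member.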
Similar resources
SVM Ensembles Are Better When Different Kernel Types Are Combined
Support Vector Machines (SVM) are strong classifiers, but large data sets might lead to prohibitively long computation times and high memory requirements. SVM ensembles, where each single SVM sees only a fraction of the data, can be an approach to overcome this barrier. In continuation of related work in this field we construct SVM ensembles with Bagging and Boosting. As a new idea we analyze S...
SVM and SVM Ensembles in Breast Cancer Prediction
Breast cancer is an all too common disease in women, so predicting it effectively is an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary ...
Empirical analysis of support vector machine ensemble classifiers
Ensemble classification – combining the results of a set of base learners – has received much attention in the machine learning community and has demonstrated promising capabilities in improving classification accuracy. Compared with neural network or decision tree ensembles, there is no comprehensive empirical research in support vector machine (SVM) ensembles. To fill this void, this paper an...
Using Attribute Behavior Diversity to Build Accurate Decision Tree Committees for microarray Data
DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for suc...
Enhanced Classification Accuracy for Cardiotocogram Data with Ensemble Feature Selection and Classifier Ensemble
In this paper, a model combining ensemble-learning-based feature selection with a classifier ensemble is proposed to improve classification accuracy. The hypothesis is that good feature sets contain features that are highly correlated with the class from ensemble feature selection to SVM ensembles which can be achieved on the performance of classification accuracy. The proposed approach consists of two phases:...